Higher-order frameworks for profiling and matching heterogeneous data
Abstract
This thesis brings together complementary research from higher-order computational logic and workflow systems to investigate software and theoretical frameworks for profiling and matching heterogeneous data. A motivating use case is submission sifting, which matches submitted conference or journal papers to potential peer reviewers based on the similarity between the paper’s abstract and the reviewer’s publications as found in online bibliographic databases. Inspired by e-Science workflows, we introduce the SubSift submission sifting framework for developing web-based research intelligence applications that profile and match heterogeneous textual content from web pages and documents. Abstracting from SubSift, we define a formal higher-order dataflow framework that ranges over a class of higher-order relations that are sufficiently expressive to represent a wide variety of data types and structures. This dataflow model is shown to be embarrassingly parallel. JSONMatch, our proof-of-concept serial implementation, is used to demonstrate that the combination of this model and higher-order representation provides a flexible approach to analysing heterogeneous data. Finally, we propose a theoretical framework for querying structured data, elevating Codd’s relational algebra to a higher-order algebra defined on the basic terms of a higher-order logic. An extension incorporates approximate joins on structured data and is demonstrated to be feasible and to hold promise for future work.
Similar resources
Accelerating high-order WENO schemes using two heterogeneous GPUs
A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...
Privacy-preserving Ontology Matching
Increasingly, there is a recognized need for secure information sharing. In order to implement information sharing between diverse organizations, we need privacy-preserving interoperation systems. In this work, we describe two frameworks for privacy-preserving interoperation systems. Ontology matching is an indispensable component of interoperation systems. To implement privacy-preserving intero...
Multivariate Chemometrics with Regression and Classification Analyses in Heroin Profiling Based on the Chromatographic Data.
The purpose of this work is to promote and facilitate forensic profiling and chemical analysis of illicit drug samples in order to determine their origin, methods of production and transfer through the country. The article is based on the gas chromatography analysis of heroin samples seized from three different locations in Serbia. Chemometric approach with appropriate statistical tools (multip...
An Efficient Algorithm for General 3D-Seismic Body Waves (SSP and VSP Applications)
The ray series method may be generalized using a ray centered coordinate system for general 3D-heterogeneous media. This method is useful for Amplitude Versus Offset (AVO) seismic modeling, seismic analysis, interpretational purposes, and comparison with seismic field observations. For each central ray (constant ray parameter), the kinematic (the eikonal) and dynamic ray tracing system ...
Publication date: 2014